We consider the force-induced unzipping transition for a heterogeneous DNA model with a correlated base-sequence. Both finite-range and long-range correlated situations are considered. It is shown that finite-range correlations increase the stability of DNA with respect to the external unzipping force. Due to long-range correlations the number of unzipped base-pairs displays two widely different scenarios depending on the details of the base-sequence: either there is no unzipping phase transition at all, or the transition is realized via a sequence of jumps with magnitude comparable to the size of the system. Both scenarios are different from the behavior of the average number of unzipped base-pairs (non-self-averaging). The results can be relevant for explaining the biological purpose of correlated structures in DNA.

I. INTRODUCTION.

Structural transformations of DNA under changing external conditions are important for molecular biology and biophysics. They take place in the transcription of genetic information from DNA and in the duplication of DNA during cell division. The common scenario of these processes is the unwinding of the double-stranded structure of DNA under the influence of external forces. Recall that a deoxyribonucleic acid (DNA) molecule consists of two strands, one wound around the other. These two strands interact via hydrogen bonds, due to which the double-helix structure is formed. Each individual strand is held together by covalent bonds, whose strength is much larger than the inter-strand coupling. Each strand is a polymer built from nucleotides. A nucleotide is a deoxyribose sugar molecule bearing on one side a purine or pyrimidine group (the base) and on the other a phosphate group. The purines are of two types, adenine (A) and guanine (G), whereas the pyrimidines are cytosine (C) and thymine (T) (an additional pyrimidine, uracil (U), is found in ribonucleic acid (RNA)). The A, G, C and T groups differentiate the nucleotides and constitute the genetic code carried by a DNA molecule. The bonds between neighboring nucleotides within one strand are formed via the corresponding phosphate groups. Hydrogen bonds between opposite strands are formed either by A-T bases or by G-C bases. Since the bases A, G, C and T are hydrophobic, they are located at the core of the double helix. In contrast, the sugar molecules and the phosphate groups are hydrophilic, and they are located in the outer part of the DNA molecule. Thus in a regular DNA molecule the letters of the genetic code are hidden from the molecular environment. This poses a problem for the polymerase enzymes, whose role is to read the genetic code. The polymerases can function only if they unzip the needed part of the DNA molecule, so that the bases are exposed to the environment. This is the main reason why DNA unzipping, in particular unzipping under an external force, is important for the functioning of all living organisms. Force-induced unzipping has been actively investigated only recently, motivated by the new generation of micromanipulation experiments.

It is expected that features of the unzipping process depend on the base-sequence of DNA, because AT and GC base-pairs have different formation energies. It is more difficult to break a single GC base-pair, since it is made of three hydrogen bonds, while a single AT base-pair is made of two hydrogen bonds only. Thus, the formation energy difference between AT and GC base-pairs is of the order of one hydrogen bond energy, that is, 0.1-0.2 eV. This is comparable with the average formation energy itself. We note in addition that for a given DNA molecule the overall concentrations of AT and GC base-pairs are approximately equal. This is especially true for higher organisms, e.g., the concentration of GC base-pairs for primates is between 49 and 51%.

The above energy difference may not be relevant for certain bulk properties of DNA. Therefore, DNA is frequently modeled assuming a homogeneous base-sequence. However, in natural conditions the energy supplied for unzipping can be comparable to the average formation energy, and then the heterogeneous character of the base-sequence becomes relevant. One of the first steps in this direction was made in earlier work, where it was shown that short-range heterogeneity does influence the unzipping process in the region where the energy supplied by an external unzipping force is comparable to the average formation energy of a DNA base-pair.

Our main purpose is to make the next step towards real DNA and to analyze force-induced unzipping for a DNA model in which the structural features of the base-sequence are taken into account. One of the known features of DNA is that its base-sequence displays substantial correlations which, in particular, can be of long-range character: two base-pairs separated from each other by thousands of pairs appear to be statistically correlated. Initial studies reported long-range correlations for non-coding regions of DNA (introns). For higher organisms, e.g. humans, these regions constitute more than 90% of DNA. It was believed for some time that coding regions, which carry the majority of genetic information, can have only short-range correlations. However, more recent results indicate the existence of weak long-range correlations in coding regions as well (this point was controversial for a while, but a general consensus on its validity gradually emerged). Moreover, systematic changes were found in the structure of correlations depending on the evolutionary category of the DNA carrier. In spite of the ubiquity of long-range correlations, their biological reason remains largely unexplored. Some attempts in this direction were made in earlier studies, where it was examined why long-range correlations are absent in certain biologically active proteins.

Our basic purpose in the present paper is to determine how statistical correlations, in particular long-range correlations, influence the unzipping process. Due to the biological relevance of unzipping, indications of such influences can provide useful information for explaining the presence of long-range correlations in DNA.

This paper is organized as follows. The basic model we work with is described in section II. The situation with a finite-range correlated base-sequence is investigated in section III. The next three sections study various aspects of the long-range correlated situation. We conclude with a summary of our results. Several technical points are outlined in the appendices.

II. THE MODEL 

There are three basic mechanisms which determine the physics of the unzipping process: An external force tending 
to unbind the double-helix structure of a DNA molecule, thermal noise generated by an equilibrium environment 
into which the molecule is embedded, and finally structural features of the molecule itself. Among various structural 
features which may be of relevance, the most important ones are connected with the base-sequence of the molecule. 

We shall work with a model which takes these three physical ingredients into account in the most minimal way. It was recently proposed for studying DNA unzipping.

i) A DNA molecule lies along the x-axis between the points $x = a$ and $x = L$.

ii) Among all degrees of freedom of the molecule we consider only the base-pairs; they are located at points $x_i$, $a < x_i < L$, $i = 1, \dots, M$. Indeed, for the range of external force where the molecule is close to being completely unbound, those degrees of freedom which are related to hydrogen bonds have much shorter characteristic times than the other degrees of freedom. The latter can therefore be considered as adiabatically frozen, and excluded from the effective description we are developing.

iii) Any base-pair can be in one of two states: bound or disconnected (broken). We choose the overall energy scale in such a way that the latter case contributes to the Hamiltonian a binding energy $\phi(x_i)$, whereas the former case contributes nothing. As we stressed in the introduction, different types of base-pairs have different binding energies: even when considering the ideal situation, where there are no "wrong base-pairs" such as AC and GT, the "correct" base-pairs AT and GC differ with respect to the energy needed to unbind them. Thus $\phi(x_i)$ is a random quantity with an average $\langle\phi\rangle$:

$$\phi(x_i) = \langle\phi\rangle + \eta(x_i). \tag{1}$$

iv) An external force acts on the left end $x = a$ of the molecule, pulling apart the two strands. Thus, if a bond $x_i$ is broken, all the base-pairs $x_j$ with $j < i$ are broken as well. Each broken bond additionally contributes to the Hamiltonian a term $-\mathcal{F}$, where $\mathcal{F}$ is proportional to the acting force.

v) Summarizing all of these, one comes to the Hamiltonian

$$H(x) = -\mathcal{F}x + \sum_{i=1}^{x}\phi(x_i) = (\langle\phi\rangle - \mathcal{F})\,x + \sum_{i=1}^{x}\eta(x_i), \tag{2}$$

where $x$ is the number of broken base-pairs.

In the thermodynamical limit, where $L, M \gg 1$, one applies the continuum description with $x$ being a real number, $a < x < L$, and ends up with the following Hamiltonian:

$$H(x) = (x - a)\,f + \int_a^x ds\,\eta(s), \tag{3}$$

where $f = \langle\phi\rangle - \mathcal{F}$ and $\beta = 1/T$ is the inverse temperature ($k_B = 1$).

vi) For the characteristic time-scales of unzipping experiments we can certainly neglect any changes of the base-sequence for a single DNA molecule. Thus, once it is modeled via the random noise $\eta$, it is legitimate to assume that this noise is frozen, i.e. a single realization of the noise corresponds to a single molecule. It is assumed that the DNA molecule is embedded into a thermal bath with temperature $T$, and has had sufficient time to reach equilibrium. Thus, the partition function and the free energy corresponding to the Hamiltonian (3) read:

$$Z = \int_a^L dx\, e^{-\beta H(x)}, \qquad F = -T\ln Z. \tag{4}$$

These quantities are still random together with $\eta$. The average results of many experiments with various realizations of $\eta$ can be described with the help of the average free energy $\langle F\rangle$. Our order parameter is the number of broken base-pairs $X$. Along with its average, it is defined for $t = 0$ (that is, $a = 0$, with $t = -a$ as introduced in Sec. II B) as

$$X = \partial_f F, \qquad \langle X\rangle = \partial_f\langle F\rangle. \tag{5}$$
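The relation between the free energy and the order parameter in Eq. (5) can be checked numerically on the discrete Hamiltonian (2). The following is a minimal sketch, assuming `numpy` is available; the function name `free_energy_and_mean_x`, the sequence length, and the use of independent Gaussian samples for the frozen noise are illustrative choices and not part of the original text.

```python
import numpy as np

def free_energy_and_mean_x(eta, f, T):
    """Return (F, <x>) for the discrete Hamiltonian H(x) = f*x + sum_{i<=x} eta_i."""
    beta = 1.0 / T
    x = np.arange(len(eta) + 1)                       # x = 0, 1, ..., M broken pairs
    H = f * x + np.concatenate(([0.0], np.cumsum(eta)))
    w = -beta * H
    shift = w.max()                                   # log-sum-exp shift for stability
    p = np.exp(w - shift)
    z_scaled = p.sum()
    F = -T * (np.log(z_scaled) + shift)               # F = -T ln Z
    mean_x = float((x * p).sum() / z_scaled)          # thermal average of x
    return F, mean_x

rng = np.random.default_rng(0)
eta = rng.normal(0.0, 1.0, size=2000)   # one frozen "sequence" (illustrative noise)
f, T, df = 0.3, 1.0, 1e-6

F_plus, _ = free_energy_and_mean_x(eta, f + df, T)
F_minus, _ = free_energy_and_mean_x(eta, f - df, T)
X_from_derivative = (F_plus - F_minus) / (2.0 * df)   # X = dF/df, Eq. (5)
_, X_direct = free_energy_and_mean_x(eta, f, T)       # thermal average of x
print(X_from_derivative, X_direct)
```

The two printed numbers should agree up to the finite-difference error, which illustrates that $\partial_f F$ is the thermal average of the number of broken base-pairs for a fixed realization of the noise.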

A. Finite-range and long-range correlated situations. 

vii) It remains to specify the properties of the noise $\eta$. Within the adopted description we assume that it is a Gaussian stationary process with the autocorrelation function

$$K(t - t') = \langle\eta(t)\,\eta(t')\rangle, \qquad K(t) = K(-t). \tag{6}$$

Two major classes can now be distinguished depending on the behavior of $K(t)$ for large $t$. The finite-range correlated situation is defined by requiring that the integral

$$D = \int_0^\infty ds\, K(s), \tag{7}$$

which determines the total intensity of the noise, is finite. There are three particular cases of the finite-range correlated situation. The white-noise case,

$$K(t) = D\,\delta(t), \tag{8}$$

describes completely uncorrelated noise. The physical situation given by Eqs. (2), (8) is well known, and has been used to describe interfaces, random walks in disordered media, and population dynamics. It was recently applied to the unzipping transition in DNA. Similar models were also considered in the literature.

The second case corresponds to noise having some finite, though possibly large, correlation length $\tau$. The simplest and most widely used model for this case is provided by Ornstein-Uhlenbeck (OU) noise,

$$K(t) = \frac{D}{\tau}\, e^{-|t|/\tau}, \tag{9}$$

where $D$ is the total intensity of the noise, and $\tau$ is the correlation time; $\tau \to 0$ corresponds to white noise. The third case is when $K(t)$ has a power-law dependence for large $t$, but still decays sufficiently quickly so that the integral in (7) is finite: $K(t) \propto |t|^{-\delta}$ with $\delta > 1$.

The second major class is the long-range correlated situation, where the integral in (7) is infinite, that is, when $K(t)$ for sufficiently large $t$ behaves according to a power law:

$$K(t) = \langle\eta(t)\,\eta(0)\rangle = \sigma\,|t|^{-\alpha}, \tag{10}$$

where

$$0 < \alpha < 1 \tag{11}$$

is the exponent characterizing the long-range correlation, and where $\sigma$ is the (local) intensity. Note that $K(t)$ has to be regular and finite for small $t$, as one would expect on physical grounds.
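The distinction between the two classes rests on whether the integral (7) converges. The sketch below, assuming `numpy`, estimates the partial integrals $\int_0^U K(s)\,ds$ for growing $U$. The helper `partial_intensity`, the parameter values, and in particular the regularization $(1+|t|)^{-\alpha}$ of the power law at small $t$ are illustrative assumptions and are not taken from the original text.

```python
import numpy as np

def partial_intensity(K, upper, n=200_001):
    """Trapezoidal estimate of int_0^upper K(s) ds."""
    s = np.linspace(0.0, upper, n)
    y = K(s)
    h = s[1] - s[0]
    return float(h * (y.sum() - 0.5 * (y[0] + y[-1])))

D, tau = 10.0, 5.0
sigma, alpha = 1.0, 0.5

def ou(s):
    """OU autocorrelation, Eq. (9)."""
    return (D / tau) * np.exp(-np.abs(s) / tau)

def long_range(s):
    """Power law as in Eq. (10) for large t; the shift by 1 keeps it finite at t = 0."""
    return sigma * (1.0 + np.abs(s)) ** (-alpha)

for upper in (1e2, 1e3, 1e4):
    print(upper, partial_intensity(ou, upper), partial_intensity(long_range, upper))
```

For the OU kernel the partial integrals saturate at $D$, while for the power law with $0 < \alpha < 1$ they keep growing as $U^{1-\alpha}$, which is the divergence that defines the long-range correlated situation.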

The OU noise (9), as the typical representative of the finite-range correlated situation, and the long-range correlated noise (10) are relevant for modeling correlations in the base-sequence of DNA. Note, however, that the real noise distributions in DNA can be much more complicated. In particular, this concerns the Gaussian property we assume (see in this context section V A, where we study a model of non-Gaussian noise to show that its predictions in the thermodynamic limit do not differ from those given by the corresponding Gaussian noise). For the long-range correlated situation there can exist several characteristic exponents for different ranges of $t$. Nevertheless, Eqs. (9), (10) are certainly the minimal models of noise which are sufficiently simple and which allow one to study both finite-range and long-range correlations.



B. Reduction to Langevin equation. 

The basic method of solving the present model is to reduce it to the physics of a Brownian particle whose dynamics is described by a stochastic differential equation. In Eq. (4) one fixes $L$, and views $a$ as a parameter varying from the highest possible value $L$, where $Z = 0$, to the lowest possible value, which we define to be $a = 0$. The quantity $t = -a$ thus increases monotonically and can be interpreted as a time variable. Differentiating $Z$ in (4) over $a$ and changing the variable as $t = -a$, one gets:

$$\frac{dZ}{dt} = 1 - \beta f Z - \beta\,\eta(t)\, Z, \qquad -L < t < 0, \tag{12}$$

where we used $\eta(t) = \eta(-t)$, as follows from the Gaussian stationary property of the noise. This is a Langevin equation with multiplicative noise. From (12) one can obtain a stochastic equation for $F = -T\ln Z$:

$$\frac{dF}{dt} + V'(F) = \eta(t), \qquad V(F) = T^2 e^{\beta F} - fF. \tag{13}$$

This is the basic stochastic equation we will work with.
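To make the reduction concrete, the sketch below compares two ways of computing $Z$ for one frozen noise realization: integrating the first-order equation for $Z$ backward in $a$ (equivalently forward in $t = -a$), and summing the Boltzmann weights of Eq. (4) directly. It assumes a piecewise-constant noise on cells of width `h`, which is an illustrative discretization; the function names and parameter values are likewise hypothetical.

```python
import math
import random

def z_by_recursion(c, h):
    """Integrate dZ/da = -1 + c(a) Z backward from Z(L) = 0 to a = 0.

    c[k] = beta * (f + eta_k) is constant on the k-th cell of width h,
    so each backward step can be carried out exactly.
    """
    Z = 0.0
    for ck in reversed(c):
        decay = math.exp(-ck * h)
        cell = h if ck == 0.0 else -math.expm1(-ck * h) / ck   # int_0^h exp(-ck x) dx
        Z = cell + decay * Z
    return Z

def z_by_direct_sum(c, h):
    """Evaluate Z = int_0^L dx exp(-beta H(x)) cell by cell, as in Eq. (4)."""
    Z, log_weight = 0.0, 0.0     # log_weight = -beta * H at the left edge of the cell
    for ck in c:
        cell = h if ck == 0.0 else -math.expm1(-ck * h) / ck
        Z += math.exp(log_weight) * cell
        log_weight -= ck * h
    return Z

random.seed(1)
beta, f, h, n_cells = 1.0, 0.2, 0.1, 500
eta = [random.gauss(0.0, 1.0) for _ in range(n_cells)]
c = [beta * (f + e) for e in eta]

Z_rec, Z_dir = z_by_recursion(c, h), z_by_direct_sum(c, h)
F = -math.log(Z_rec) / beta      # free energy F = -T ln Z
print(Z_rec, Z_dir, F)
```

Both routes give the same $Z$ up to rounding, which is the content of Eq. (12): the partition function can be built up by evolving a single stochastic variable along the sequence instead of summing over all unzipping configurations.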

III. FINITE RANGE CORRELATED NOISE. 

A. Ornstein-Uhlenbeck noise. 

Our main purpose here is to study the process of unzipping in the presence of the finite-range correlated noise given by Eq. (9). We wish to understand how the magnitude of $\tau$ influences unzipping.
Note that the OU noise (9) can itself be modeled via a white noise:

$$\tau\,\dot\eta = -\eta + \sqrt{D}\,\xi(t), \tag{14}$$

where $\xi(t)$ is a Gaussian noise with a delta-correlated spectrum:

$$\langle\xi(t)\,\xi(t')\rangle = \delta(t - t'). \tag{15}$$

Indeed, Eq. (9) is recovered directly from (14), (15), since their exact solution is:

$$\langle\eta(t)\,\eta(t')\rangle = e^{-(t+t')/\tau}\left(\langle\eta^2(0)\rangle - \frac{D}{\tau}\right) + \frac{D}{\tau}\, e^{-|t-t'|/\tau}. \tag{16}$$

We get back Eq. (9) from here under the additional consistency condition $\langle\eta^2(0)\rangle = D/\tau$. Moreover, $\eta(t)$ is a Gaussian random process, because $\xi(t)$ is Gaussian and Eq. (14) is linear.
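A stationary OU sequence with the autocorrelation function (9) can be sampled directly, which gives a quick empirical check of the statement above. The sketch assumes `numpy` and uses the exact one-step update of a linear Gaussian process rather than an Euler discretization of (14); the function names, grid step, and sample size are illustrative assumptions.

```python
import numpy as np

def simulate_ou(D, tau, dt, n_steps, rng):
    """Sample a stationary OU path with K(t) = (D/tau) exp(-|t|/tau) on a grid of step dt."""
    var = D / tau                                   # stationary variance <eta^2>
    rho = np.exp(-dt / tau)                         # one-step correlation
    kicks = rng.normal(0.0, np.sqrt(var * (1.0 - rho**2)), size=n_steps)
    eta = np.empty(n_steps)
    eta[0] = rng.normal(0.0, np.sqrt(var))          # consistency condition <eta^2(0)> = D/tau
    for k in range(1, n_steps):
        eta[k] = rho * eta[k - 1] + kicks[k]
    return eta

def empirical_autocorrelation(eta, lag):
    """Sample estimate of <eta(t) eta(t + lag*dt)> along one long path."""
    return float(np.mean(eta[: len(eta) - lag] * eta[lag:]))

rng = np.random.default_rng(42)
D, tau, dt = 10.0, 2.0, 0.1
eta = simulate_ou(D, tau, dt, 400_000, rng)

lag = 20                                            # lag time = lag * dt = tau
K_emp = empirical_autocorrelation(eta, lag)
K_theory = (D / tau) * np.exp(-lag * dt / tau)
print(np.var(eta), D / tau, K_emp, K_theory)
```

Starting the path from the stationary variance $D/\tau$ removes the transient term in (16), so the sampled autocorrelation depends only on the time difference.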

To handle (13) one differentiates it over $t$ and uses (13) and (14). Changing the variable as $s = t/\sqrt{\tau}$ one gets:

$$\frac{d^2F}{ds^2} = -\gamma(F)\,\frac{dF}{ds} - V'(F) + \frac{\sqrt{D}}{\tau^{1/4}}\,\xi(s), \tag{17}$$

where

$$\gamma(F) = \tau^{-1/2} + V''(F)\,\tau^{1/2}. \tag{18}$$

Eq. (17) has the same form as a Langevin equation for a particle with unit mass in the potential $V(F)$, subjected to a white noise and an $F$-dependent friction with coefficient $\gamma(F)$. Note that the potential $V(F)$ is confining only for $f > 0$: $V(F) \to \infty$ for $F \to \pm\infty$.

We can rewrite Eq. (17) by introducing an additional variable $\dot F(s) = dF(s)/ds$, which in the above language of Brownian motion corresponds to the velocity:

$$\frac{dF}{ds} = \dot F, \tag{19}$$

$$\frac{d\dot F}{ds} = -\gamma(F)\,\dot F - V'(F) + \frac{\sqrt{D}}{\tau^{1/4}}\,\xi(s). \tag{20}$$



As $\xi(s)$ is a Gaussian white noise, one uses the standard tools and writes down from (19), (20) a Fokker-Planck-Klein-Kramers equation for the joint probability distribution

$$P(F, \dot F, s) = \langle\,\delta(F - F(s))\,\delta(\dot F - \dot F(s))\,\rangle, \tag{21}$$

where $F(s)$ and $\dot F(s)$ are particular noise-dependent solutions of (19), (20), and where the average is taken over the white noise $\xi$ given by Eq. (15):

$$\frac{\partial P}{\partial s} = -\dot F\,\frac{\partial P}{\partial F} + \gamma(F)\,\frac{\partial[\dot F P]}{\partial\dot F} + V'(F)\,\frac{\partial P}{\partial\dot F} + \frac{D}{\tau^{1/2}}\,\frac{\partial^2 P}{\partial\dot F^2}, \qquad P = P(F,\dot F,s). \tag{22}$$

Our interest is in the large-$s$ limit of this equation (thermodynamic limit), and we want to have the reduced probability distribution $P(F, s)$ of $F$ only:

$$P(F, s) \equiv \langle\delta(F - F(s))\rangle = \int d\dot F\, P(F, \dot F, s). \tag{23}$$

To this end let us introduce

$$Q_n(F, s) = \int d\dot F\, \dot F^{\,n}\, P(F, \dot F, s), \qquad n = 0, 1, 2, 3, \dots, \tag{24}$$

where $Q_0(F, s) = P(F, s)$. From (22) one gets an infinite set of coupled equations for $Q_n(F, s)$:

$$\frac{\partial Q_0(F,s)}{\partial s} = -\frac{\partial Q_1(F,s)}{\partial F}, \tag{25}$$

$$\frac{\partial Q_1(F,s)}{\partial s} = -\frac{\partial Q_2(F,s)}{\partial F} - \gamma(F)\,Q_1(F,s) - V'(F)\,Q_0(F,s), \tag{26}$$

$$\frac{\partial Q_2(F,s)}{\partial s} = -\frac{\partial Q_3(F,s)}{\partial F} - 2V'(F)\,Q_1(F,s) - 2\gamma(F)\,Q_2(F,s) + \frac{2D}{\tau^{1/2}}\,Q_0(F,s), \tag{27}$$

$$\dots \tag{28}$$

When deriving (25)-(26) we used integration by parts, and the following standard boundary conditions:

$$P(F, \dot F, s) \to 0, \quad \text{if } F \to \pm\infty, \text{ or if } \dot F \to \pm\infty. \tag{29}$$

These conditions are physically meaningful if the potential $V(F)$ is confining, so that the motion of the corresponding Brownian particle takes place in a finite domain. According to the above discussion on the confining character of the potential $V(F) = T^2 e^{\beta F} - fF$, the boundary conditions (29) are reliable only for $f > 0$.

Recall that the "time-variable" $t$ moves between $-L$ and $0$. For large lengths, i.e. for $L \gg 1$ (thermodynamic limit), and as a consequence for $t \propto s \to 0$, any solution of equation (22) relaxes towards the unique stationary distribution $P_{st}(F, \dot F)$. A rather general proof of this fact is available in the literature.

We shall now use (25), (26) to get explicitly the stationary distribution function $P_{st}(F)$ of $F$. Setting the LHS of (25) to zero, one finds that $Q_{1,st}(F)$ does not depend on $F$. Taking into account the boundary condition (29), one concludes that it is equal to zero:

$$Q_{1,st}(F) = 0. \tag{30}$$

Setting the LHS of (26) to zero and using (30) we get

$$\frac{dQ_{2,st}(F)}{dF} = -V'(F)\,Q_{0,st}(F). \tag{31}$$

It remains to determine $Q_{2,st}(F)$ by setting the LHS of Eq. (27) to zero. One can conjecture that the stationary state $P_{st}(F, \dot F)$ is symmetric with respect to $\dot F \to -\dot F$, and then $Q_{3,st}(F) = 0$ in the same way as for $Q_{1,st}(F)$ in (30). Alternatively, one can assume that $\gamma(F)$ and $D$ are sufficiently large so that the term $dQ_{3,st}(F)/dF$ can simply be dropped in the RHS of (27). If $V''(F)$ is of order one, then a large $\gamma(F)$ is realized both for large and small $\tau$. Thus we conclude from (27):

$$\gamma(F)\,Q_{2,st}(F) = \frac{D}{\tau^{1/2}}\,Q_{0,st}(F). \tag{32}$$



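In view of (31), (32) one has a single differential equation,

$$\frac{d}{dF}\left[\frac{D}{\tau^{1/2}\,\gamma(F)}\,Q_{0,st}(F)\right] = -V'(F)\,Q_{0,st}(F), \tag{33}$$

and gets for $Q_{0,st}(F) = P_{st}(F)$,

$$P_{st}(F) \propto \gamma(F)\exp\left[-\frac{\tau}{2D}\,[V'(F)]^2 - \frac{1}{D}\,V(F)\right],$$

$$P_{st}(F) = \mathcal{N}\,(1 + \tau T e^{\beta F})\exp\left[\frac{fF}{D} - \frac{T(T - \tau f)}{D}\,e^{\beta F} - \frac{\tau T^2}{2D}\,e^{2\beta F}\right], \tag{34}$$

where $\mathcal{N}$ is the normalization factor. The white-noise limit, $\tau \to 0$, of $P_{st}(F)$ was obtained in earlier work.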




FIG. 1: $\langle X\rangle$ for Ornstein-Uhlenbeck noise with $D = 10$, $T = 1$. From right to left: $\tau = 0$, $\tau = 10$, $\tau = 100$. It is seen that, for a fixed $f$, $\langle X\rangle$ decreases upon increasing $\tau$.



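According to (34), and changing the variable as $u = e^{\beta F}$, the average free energy reads:

$$\langle F\rangle = T\;\frac{\displaystyle\int_0^\infty du\,\ln(u)\left(\tau + \frac{1}{Tu}\right)u^{\mu}\exp\left[\left(\mu\tau - \frac{T^2}{D}\right)u - \frac{\tau T^2}{2D}\,u^2\right]}{\displaystyle\int_0^\infty du\left(\tau + \frac{1}{Tu}\right)u^{\mu}\exp\left[\left(\mu\tau - \frac{T^2}{D}\right)u - \frac{\tau T^2}{2D}\,u^2\right]}, \tag{35}$$

where

$$\mu = \frac{fT}{D}. \tag{36}$$

Note that both integrals in (35) can be expressed through the gamma function $\Gamma(x)$ and the confluent hypergeometric (Kummer) function ${}_1F_1(a, b; z)$, since for $c > -1$ and $q > 0$

$$\int_0^\infty du\, u^{c}\, e^{pu - qu^2} = \frac{1}{2}\,q^{-1-c/2}\left[\,p\,\Gamma\!\left(1 + \frac{c}{2}\right){}_1F_1\!\left(1 + \frac{c}{2}, \frac{3}{2}; \frac{p^2}{4q}\right) + \sqrt{q}\;\Gamma\!\left(\frac{1+c}{2}\right){}_1F_1\!\left(\frac{1+c}{2}, \frac{1}{2}; \frac{p^2}{4q}\right)\right]. \tag{37}$$

Similar formulas can be written for $\int_0^\infty du\, u^{c}\,(\ln u)^n\, e^{pu - qu^2}$ for $n = 1, 2$. These representations facilitate numerical calculations.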

The average number of broken base-pairs $\langle X\rangle$ can be calculated from (5), (34). Note that for the white-noise situation $\tau \to 0$ a simple formula is obtained:

$$\langle X\rangle = \frac{T^2}{D}\,\frac{d\psi(\mu)}{d\mu}, \tag{38}$$

where $\psi(\mu) \equiv \Gamma'(\mu)/\Gamma(\mu)$. For $\mu \to 0$, $\langle X\rangle$ does not depend on temperature and on $\tau$ and becomes very large:

$$\langle X\rangle = D f^{-2}, \tag{39}$$

for $f \to 0$. When the external force reaches its critical value, the average number of broken base-pairs diverges in the thermodynamic limit.
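The white-noise formula can be cross-checked against the stationary distribution (34) by direct quadrature. The sketch below, assuming `numpy`, evaluates $\langle F\rangle$ on a uniform grid in $F$ and differentiates it numerically with respect to $f$. The grid limits, the step `df`, and the function names are illustrative choices; the reference value used in the comparison is $(T^2/D)\,\psi'(\mu)$ at $\mu = 0.05$, computed here from the series of the trigamma function.

```python
import numpy as np

def mean_free_energy(f, tau, D, T, n=400_001):
    """<F> from the stationary distribution (34), by quadrature on a uniform F grid."""
    beta = 1.0 / T
    F = np.linspace(-60.0 * D / f, 12.0 * T, n)       # tails beyond this range are negligible
    u = np.exp(beta * F)
    log_w = (np.log1p(tau * T * u) + f * F / D
             - T * (T - tau * f) * u / D - tau * T**2 * u**2 / (2.0 * D))
    w = np.exp(log_w - log_w.max())                   # unnormalized weights, overflow-safe
    return float((F * w).sum() / w.sum())

def mean_broken_pairs(f, tau, D, T, df=1e-3):
    """<X> = d<F>/df by a central finite difference, Eq. (5)."""
    return (mean_free_energy(f + df, tau, D, T)
            - mean_free_energy(f - df, tau, D, T)) / (2.0 * df)

D, T, f = 10.0, 1.0, 0.5
X_white = mean_broken_pairs(f, 0.0, D, T)    # white-noise limit, tau = 0
X_tau10 = mean_broken_pairs(f, 10.0, D, T)   # finite correlation time, for comparison
print(X_white, D / f**2, X_tau10)
```

For $\tau = 0$ the result should be close to $(T^2/D)\,\psi'(\mu)$, which at these parameters is already near the small-$f$ asymptote $D f^{-2} = 40$. Evaluating the same function at several values of `tau` gives curves of the kind shown in Fig. 1.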

To study the influence of $\tau$ on this unzipping phase transition, one should keep in mind the realistic situation, where DNA molecules belonging to different evolutionary classes have different correlation properties of their base-sequences. At the same time, the concentrations (fractions) of AT and GC base-pairs are known to be (approximately) equal for sufficiently long DNA molecules in natural conditions. Therefore, in comparing two situations having different correlation characteristics, it is legitimate to keep fixed the intensity of the noise defined by Eq. (7) (this corresponds to a fixed concentration of the various base-pairs) and to study how the average number of broken base-pairs $\langle X\rangle$ depends on $\tau$ for some fixed value of $f$. This dependence is displayed in Fig. 1 according to Eqs. (5), (35). It is seen that the behavior of $\langle X\rangle$ for very small $f$ depends on $\tau$ rather weakly. Indeed, as follows from Eq. (34), for $f \to 0$ the relevant domain of $F$ contributing to $\langle F\rangle$ is $F \sim -D/f$. As it does not depend on $\tau$, we get back Eq. (39). However, a non-trivial dependence on $\tau$ does exist for moderately small values of $f$, where, as seen in Fig. 1, $\langle X\rangle$ is a decreasing function of $\tau$ for a fixed $f$: longer correlations present in the base-sequence increase the stability of the DNA molecule, since larger external forces $\mathcal{F}$ are needed to achieve the same average number of broken base-pairs. This is our main qualitative conclusion on how a finite correlation length influences the unzipping process.

B. Arbitrary finite-range correlated noise at low temperatures. 

In the previous section we reduced the non-linear equation (13) with the finite-range correlated noise (9) to a Fokker-Planck equation, and solved the latter exactly in the thermodynamic limit. The essential feature that made this analytical solution possible is that the OU noise has a single and well-defined characteristic time and therefore allows the representation (14), (15).

In general it is impossible to solve (13) for an arbitrary Gaussian noise, and, in particular, for the situation given by Eq. (10): there is no exact Fokker-Planck equation for this case. There is, however, a particular case which allows analytical treatment. For very low temperatures, $T \to 0$, one can approximately substitute $V(F)$ in Eq. (13) by $-fF$ for $F < 0$ and by an infinite potential wall standing at $F = 0$. Thus, all values $F > 0$ become prohibited. For this particular form of the potential one can get a Fokker-Planck equation for

$$P(F, t) = \langle\delta(F - F[\eta, t])\rangle \tag{40}$$

with an arbitrary Gaussian noise in the RHS of Eq. (13). The derivation goes as follows. Write Eq. (13) as

$$\frac{dF}{dt} = f + \eta(t), \tag{41}$$

where the stochastic variable $F$ is restricted to be negative due to the above infinite wall. Differentiating $P(F, t)$ in (40) over $t$ one gets

$$\frac{\partial P(F,t)}{\partial t} = -f\,\frac{\partial P(F,t)}{\partial F} - \frac{\partial}{\partial F}\,\langle\eta(t)\,\delta(F - F[\eta, t])\rangle. \tag{42}$$

It remains to handle the last term in this equation. One uses Novikov's theorem:

$$\langle\eta(t)\,\delta(F - F[\eta, t])\rangle = -\frac{\partial}{\partial F}\int_{-L}^{0} ds\, K(t - s)\left\langle\delta(F - F[\eta, t])\,\frac{\delta F[\eta, t]}{\delta\eta(s)}\right\rangle, \tag{43}$$

where $\delta/\delta\eta(s)$ is the variational derivative and $\delta F[\eta, t]/\delta\eta(s)$ is obtained from (41):

$$\frac{\delta F[\eta, t]}{\delta\eta(s)} = \theta(t - s), \tag{44}$$

where $\theta(t - s)$ is the step function. Combining (42), (43), (44) we get finally

$$\frac{\partial P(F,t)}{\partial t} = -f\,\frac{\partial P(F,t)}{\partial F} + D_t\,\frac{\partial^2 P(F,t)}{\partial F^2}, \tag{45}$$

$$D_t = \int_0^{t+L} ds\, K(s). \tag{46}$$

Jq 
Eq. (45) should additionally be supplemented by a boundary condition which reflects the presence of the infinite wall at $F = 0$. Equation (45) can be written as the continuity equation

$$\frac{\partial P(F,t)}{\partial t} + \frac{\partial J(F,t)}{\partial F} = 0, \qquad J(F,t) = f\,P(F,t) - D_t\,\frac{\partial P(F,t)}{\partial F}, \tag{47}$$

where $J(F, t)$ is the probability current. The infinite wall at $F = 0$ is now implemented by requiring:

$$\int_{-\infty}^{0} dF\, P(F, t) = 1, \tag{48}$$

$$J(0, t) = 0, \tag{49}$$

for all $t$. Conditions (48), (49) are imposed on any solution of (45).

In the thermodynamic limit $L \to \infty$ and $t \to 0$, one gets from the stationarity condition $\partial P(F, t)/\partial t = 0$:

$$P(F) = \frac{f}{D}\exp\left[\frac{fF}{D}\right], \quad \text{for } F < 0, \tag{50}$$

$$P(F) = 0, \quad \text{for } F > 0, \tag{51}$$

where the total intensity $D$, as given by (7), is finite for the considered short-range correlated situation. Note that in the thermodynamical limit conditions (48), (49) are satisfied automatically, as seen from Eqs. (50), (51). It is now seen from (5) that

$$\langle X\rangle = D f^{-2}, \tag{52}$$

which has the same $f$-dependence as the white-noise case for small $f$; see (39). We conclude that, not unexpectedly, for low temperatures the behavior of $\langle X\rangle$ is determined only by the total intensity of the noise. All other details of $K(t)$ do not matter. It remains to stress that the present analysis certainly does not apply to the long-range correlated situation (10), since there the total intensity $D$ diverges in the thermodynamical limit.
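As a short consistency check of the low-temperature result, the stationary solution can be verified directly. The lines below are a sketch that uses only the zero-current condition for (45) with $D_t \to D$, the normalization (48), and the definition (5); no further input is assumed.

```latex
% Zero-current condition J = 0 with D_t -> D:
f P(F) - D \frac{dP}{dF} = 0
  \;\Rightarrow\; P(F) = C\, e^{fF/D}, \quad F < 0 .
% Normalization (48) fixes C:
\int_{-\infty}^{0} C\, e^{fF/D}\, dF = \frac{C D}{f} = 1
  \;\Rightarrow\; C = \frac{f}{D} .
% Average free energy and order parameter:
\langle F \rangle = \int_{-\infty}^{0} F\, \frac{f}{D}\, e^{fF/D}\, dF = -\frac{D}{f},
\qquad
\langle X \rangle = \partial_f \langle F \rangle = \frac{D}{f^{2}} .
```

The only property of the noise that enters is its total intensity $D$, in line with the conclusion stated above.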

In closing this section, let us note that Eq. (52) can be applied to finite-range correlated noise that for $t \ll L$ has the same autocorrelation function as (10). As an example take

$$K_l(t) = \sigma\,|t|^{-\alpha}, \quad \text{for } |t| < l, \tag{53}$$

$$K_l(t) = 0, \quad \text{for } |t| > l, \tag{54}$$

where $l$ is some parameter that is finite in the thermodynamical limit $L \to \infty$. Therefore, the noise given by Eqs. (53), (54) is obviously finite-range correlated. Eq. (52) now reads

$$\langle X\rangle = \frac{\sigma}{1 - \alpha}\, l^{1-\alpha} f^{-2}. \tag{55}$$

If one chooses to take $l \sim \langle X\rangle$, then $\langle X\rangle \sim f^{-2/\alpha}$, as predicted in earlier work. However, there is no a priori reason for this choice, and at any rate this result refers to the finite-range correlated noise $K_l$. The real long-range correlated situation, where $l \sim L$, is still not described by it.
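The steps leading from the truncated power-law kernel to the scaling $f^{-2/\alpha}$ can be spelled out as follows. This is a sketch of the arithmetic only; the symbol $D_l$ for the total intensity of the truncated noise is introduced here for convenience and does not appear in the original text.

```latex
% Total intensity (7) for the truncated kernel, using 0 < alpha < 1:
D_l = \int_0^{l} \sigma s^{-\alpha}\, ds = \frac{\sigma\, l^{1-\alpha}}{1-\alpha} .
% Insert into the low-temperature result <X> = D f^{-2}:
\langle X \rangle = \frac{D_l}{f^{2}} = \frac{\sigma}{1-\alpha}\, l^{1-\alpha} f^{-2} .
% Self-consistent choice l ~ <X> (an assumption, not derived here):
\langle X \rangle^{\alpha} \sim \frac{\sigma}{1-\alpha}\, f^{-2}
  \;\Rightarrow\; \langle X \rangle \sim f^{-2/\alpha} .
```

The last line makes explicit that the exponent $-2/\alpha$ follows only once the cutoff $l$ is tied to $\langle X\rangle$ by hand.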

IV. LONG-RANGE CORRELATED SITUATION: THE FROZEN NOISE LIMIT. 

The present and the next section are devoted to the long-range correlated situation, where according to Eq. (10) the autocorrelation function $K(t)$ of the noise has a power-law behavior with the single characteristic exponent $1 > \alpha > 0$.

To start with, let us consider the case with $\alpha \to 0$. The noise is now completely frozen: $\eta(s)$ in (3) does not depend on $s$. This situation is less physical as compared to that with $\alpha > 0$. However, it is exactly solvable, and one can hope that it catches at least some features of the realistic situation where $\alpha$ is larger than zero, but certainly smaller than one. This intuitive expectation will be confirmed later on.

The problem with $\alpha = 0$ is easily solved from (4). Moreover, the exact solution can be obtained for an arbitrary value of $L$:

$$X = L\,g[\beta L(f + \eta)], \tag{56}$$

$$g(y) = \frac{1}{y} - \frac{1}{e^{y} - 1}. \tag{57}$$

It is seen that in the thermodynamical limit $L \to \infty$, $g[\beta L(f + \eta)]$ behaves roughly as the step function, $g[\beta L(f + \eta)] \simeq \theta(-\eta - f)$: for any single realization of the noise there is a sharp phase transition with a jump at the realization-dependent point $f = -\eta$. Exactly at this point $f = -\eta$ one has $g(0) = 1/2$ and $X = L/2$.
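The step-like behavior of the single-realization order parameter is easy to see numerically. The sketch below assumes the form $g(y) = 1/y - 1/(e^{y} - 1)$, which follows from integrating the Boltzmann weight $e^{-\beta x (f + \eta)}$ over $0 < x < L$ for a constant $\eta$; the function names and the numerical cutoffs are illustrative choices.

```python
import math

def g(y):
    """g(y) = 1/y - 1/(exp(y) - 1), evaluated stably for all real y."""
    if abs(y) < 1e-5:
        return 0.5 - y / 12.0            # Taylor expansion around y = 0
    if y > 700.0:
        return 1.0 / y                   # exp(y) would overflow; second term is negligible
    return 1.0 / y - 1.0 / math.expm1(y)

def broken_pairs(f, eta, L, T):
    """X for one frozen realization eta: X = L * g(L * (f + eta) / T)."""
    return L * g(L * (f + eta) / T)

L, T, eta = 1.0e4, 1.0, -1.0             # one frozen noise value, transition at f = -eta = 1
for f in (0.5, 0.99, 1.0, 1.01, 2.0):
    print(f, broken_pairs(f, eta, L, T) / L)
```

For large $L$ the printed ratio $X/L$ drops from nearly one to nearly zero within a window of width of order $T/L$ around $f = -\eta$, and takes the value $1/2$ exactly at the transition point.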

Let us now study the behavior of $\langle X\rangle$. Since the noise is completely frozen, the calculation of $\langle X\rangle$ reduces to averaging over a Gaussian variable with dispersion $\sigma$. We have

$$\frac{\langle X\rangle}{L} = \langle g[\beta L(f + \eta)]\rangle = \int\frac{d\eta}{\sqrt{2\pi\sigma}}\exp\left[-\frac{\eta^2}{2\sigma}\right] g[\beta L(f + \eta)] = \int\frac{d\xi}{\sqrt{2\pi\sigma\beta^2}}\exp\left[-\frac{(\xi - \beta f)^2}{2\sigma\beta^2}\right] g[L\xi], \tag{58}$$

where we changed the integration variable as $\xi = \beta(\eta + f)$. In the thermodynamical limit $L \to \infty$, we shall obtain for $\langle X\rangle/L$ the main term of order $O(L^0)$, and the first correction to it, which will turn out to be of order $O(1/L)$. To this end, let us divide the integration in the RHS of (58) into three pieces:



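$$\int_{-\infty}^{\infty} = \int_{-\infty}^{-2/L} + \int_{-2/L}^{2/L} + \int_{2/L}^{\infty}. \tag{59}$$

For each piece we shall use the following approximate expressions obtained from (58):

$$g[L\xi] \simeq 1 + \frac{1}{L\xi}, \quad \text{for } \xi < -\frac{2}{L}, \tag{60}$$

$$g[L\xi] \simeq \frac{1}{2}, \quad \text{for } -\frac{2}{L} < \xi < \frac{2}{L}, \tag{61}$$

$$g[L\xi] \simeq \frac{1}{L\xi}, \quad \text{for } \xi > \frac{2}{L}. \tag{62}$$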
To obtain (60) and (62) we neglected terms of order $O(e^{-L|\xi|})$, which is certainly legitimate in the thermodynamic limit. For (61), which corresponds to the second integration piece in (59), we have taken the value of $g[L\xi]$ at $\xi = 0$. The boundary points were chosen so as to ensure a continuous matching. However, neither the precise value of $g[L\xi]$ within the second piece of integration in (59), nor the precise values of the points separating this piece from the remaining ones, are important, since, as we show below, the contribution coming from this second piece, as well as the contributions from the boundary points of the two other integration pieces, produce factors of order $O(1/L^2)$ at best.



Combining (59)-(62) with (58) one gets

[Eqs. (63)-(66)]



One notes that both (65) and (66) are of order $O(1/L^2)$. This can be verified by directly expanding the integrals in (65) and (66) for small $2/L$. Skipping these terms, one gets

[Eq. (67)]

When obtaining the last term in the RHS of (67), we used a tabulated identity for the error function.

For $f$ not very large as compared to $\sqrt{\sigma}$, the first term in the RHS of (67) dominates: $\langle X\rangle/L$ is of order one-half. In particular, it is exactly equal to one-half for $f = 0$. The dependence of $\langle X\rangle$ on $f$ thus becomes very weak for $f \to 0$. The second, subdominant term becomes non-negligible for $f \gg \sqrt{\sigma}$, where using asymptotic identities (see the Appendix)

[Eqs. (68), (69)]

one gets from (67), in terms of the rescaled force $a \propto f/\sqrt{\sigma}$:

[Eq. (70)]



Note that for $f \gg \sqrt{\sigma}$, $\langle X\rangle$ has, within the leading order, the same $1/f$-dependence as in the completely homogeneous situation without noise ($\sigma = 0$). In the considered regime, the noise only renormalizes this behavior by modifying the subdominant terms.

As compared to $X/L$, which has a jump at the realization-dependent point $f = -\eta$, $\langle X\rangle/L$ is seen to behave smoothly. It displays a crossover between small $\langle X\rangle/L$ for a large $f$ and $\langle X\rangle = L/2$ for $f = 0$: the sharp transition disappears; see Fig. 2. This indicates that the situation for the totally correlated noise is essentially non-self-averaging: in the thermodynamical limit the averaged order parameter $\langle X\rangle$ does not reproduce the behavior of $X$ for a typical realization.

Recall that for disordered systems all observables like free energy, order parameters, correlation functions, etc., depend on the realization of the disorder, i.e. they are random quantities. It is of immediate interest to know their most probable (typical) values, since these will be met in experiments. If for a given quantity its typical value in the thermodynamic limit coincides with its average, one speaks of self-averaging. In practice this means that it is sufficient to study averages, as they are representative of single-sample measurements. It is known on general grounds that in the proper thermodynamic limit, that is, when the linear size $L$ of the system is much larger than any other characteristic length, and provided the distribution of the disorder is finite-range correlated, quantities that scale with the volume of the studied random system (these are extensive quantities such as the free energy and the order parameter, but not the statistical sum) are expected to display self-averaging [22, 24]. This result is based on the law of large numbers. However, this need not be true if the distribution of the disorder is long-range correlated, since now the correlation length of the disorder has the same order of magnitude as the linear size, and the arguments based on the law of large numbers do not apply. The above situation is just of this sort.



A. Dispersion as a measure of non-self-averaging. 



It is desirable to have more quantitative indications of the above non-self-averaging effect. To characterize fluctuations of $X$ from one realization to another, it is natural to employ the corresponding dispersion $\langle X^2\rangle - \langle X\rangle^2$, which tells us how the quantity $X$ fluctuates from one realization to another. Then the statement of self-averaging will read:

$$\frac{\langle X^2\rangle - \langle X\rangle^2}{\langle X\rangle^2} \to 0, \quad \text{for } L \to \infty. \tag{71}$$

In contrast, if $(\langle X^2\rangle - \langle X\rangle^2)/\langle X\rangle^2$ remains finite for $L \to \infty$, we have non-self-averaging.

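The quantity $\langle X^2\rangle$ can be calculated in the same way as in Eqs. (63, 67). We present the result only for $f$ not very large compared to $\sqrt{\sigma}$, that is, when $\langle X^2\rangle \propto L^2$:

$$
\langle X^2\rangle = L^2 \int_{\beta f}^{\infty} \frac{d\zeta}{\sqrt{2\pi\sigma\beta^2}}\, \exp\left[-\frac{\zeta^2}{2\sigma\beta^2}\right]. \tag{72}
$$

Substituting this into (71), we see that $(\langle X^2\rangle - \langle X\rangle^2)/\langle X\rangle^2$ remains finite in the thermodynamic limit:

$$
\frac{\langle X^2\rangle - \langle X\rangle^2}{\langle X\rangle^2} = \left[\int_{\beta f}^{\infty} \frac{d\zeta}{\sqrt{2\pi\sigma\beta^2}}\, \exp\left(-\frac{\zeta^2}{2\sigma\beta^2}\right)\right]^{-1} - 1. \tag{73}
$$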



In particular, for $f \to 0$,

$$
\frac{\langle X^2\rangle - \langle X\rangle^2}{\langle X\rangle^2} = 1, \tag{74}
$$

indicating essential non-self-averaging.
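
The finite value of the relative dispersion can be checked numerically. The sketch below is a minimal illustration under assumptions that are not spelled out in this section: it takes the frozen-noise model in a continuous form, $Z = \int_0^L dx\, e^{-\beta(f+\eta)x}$, with a single Gaussian $\eta$ of variance $\sigma$ per realization, samples $\eta$, and compares the sampled ratio with the leading large-$L$ estimate $1/P - 1$, where $P$ is the probability that $\eta < -f$. The function names, the sample size and the value of $L$ are illustrative choices.

```python
import math
import random

def thermal_mean_x(a, L):
    """Mean of x on [0, L] under the Boltzmann weight exp(-a*x).

    Closed form: 1/a - L/(exp(a*L) - 1), with guards for the limits
    a*L -> 0 (flat weight) and a*L -> +infinity (overflow of exp).
    """
    z = a * L
    if abs(z) < 1e-6:
        return L / 2.0
    if z > 700.0:  # exp(z) would overflow; the second term vanishes
        return 1.0 / a
    return 1.0 / a - L / math.expm1(z)

def relative_dispersion(f, sigma=1.0, beta=1.0, L=1.0e4, n=20000, seed=0):
    """Monte Carlo estimate of (<X^2> - <X>^2) / <X>^2 over frozen-noise samples."""
    rng = random.Random(seed)
    s1 = s2 = 0.0
    for _ in range(n):
        eta = rng.gauss(0.0, math.sqrt(sigma))  # one frozen eta per realization
        x = thermal_mean_x(beta * (f + eta), L)
        s1 += x
        s2 += x * x
    mean = s1 / n
    return (s2 / n - mean * mean) / (mean * mean)

def leading_order_ratio(f, sigma=1.0):
    """Large-L estimate 1/P - 1, with P = Prob(eta < -f) for Gaussian eta."""
    p = 0.5 * math.erfc(f / math.sqrt(2.0 * sigma))
    return 1.0 / p - 1.0
```

Calling `relative_dispersion(0.0)` and `leading_order_ratio(0.0)` side by side shows the sampled ratio settling near the value of Eq. (74), up to sampling noise and corrections that vanish for large $L$.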

In closing this section, let us repeat that the character of the thermodynamic limit for the considered case $\alpha = 0$ is different from that of the finite-range correlated situation, where for $L \to \infty$ the behavior of $\langle X\rangle$ becomes independent of $L$, at least in the physical range of the other parameters (e.g., $1 > f > 0$, $T > 0$, etc.). For the $\alpha = 0$ case, as seen from Eqs. (67, 70), there is an explicit dependence on $L$ in the whole physical range of the involved parameters. According to (70), if $L$ is kept large but finite, this dependence is very weak for external forces far from their critical value $f = 0$, that is, for $f \gg \sqrt{\sigma}$. There are no reasons for regarding this explicit dependence on $L$ as something unphysical. On the contrary, the actual size of physically relevant examples of DNA is never more than
L ^ 10^ — 10^; see Refii. This is certainly much smaller than the number 10^^ which in the standard statistical 
physics is taken as the typical size. Therefore, it is rather natural to study the physics of unzipping for a large but 
fixed L. 

For the considered frozen situation, we could solve the problem analytically for a given realization. However, for $\alpha > 0$ this is not possible, and one has to rely on numerical methods. This is what we do in the next section.

FIG. 2: $X(f)/L$ for a particular realization (solid curve) and $\langle X(f)\rangle/N$ (dotted curve) versus $f$; $T = \sigma = 1$ and $L = 10^4$.



V. NUMERICAL RESULTS. 

As we have seen in the previous section, there are reasons to expect that for the long-range correlated situation, especially for a sufficiently small index $\alpha$, the typical behavior of $X(f)$ in the thermodynamic limit (that is, the behavior frequently met among many independent realizations of the noise) is not described adequately by the average quantity $\langle X\rangle$ (non-self-averaging). We note in this context that the ratio $(\langle X^2\rangle - \langle X\rangle^2)/\langle X\rangle^2$ studied in section IV can indicate non-self-averaging, but by itself does not provide any direct information on typical realizations. Once non-self-averaging is expected, the attention should shift towards typical realizations, since they have a direct physical meaning for single-molecule experiments.

In the present section we study numerically the behavior of the number of broken base-pairs $X$ as a function of $f$, both for the long-range correlated situation and for the uncorrelated noise. For the discrete version of the model the partition function reads:

$$
Z = \sum_{k=1}^{L} \exp\Big[-\beta\Big(fk + \sum_{j=1}^{k}\eta_j\Big)\Big], \tag{75}
$$

where for the long-range correlated situation the $\eta_i$ are Gaussian random variables with the autocorrelation function given by (10). Note that for the purposes of numerical computations the behavior of $K(t)$ in (10) was regularized at short distances so as to avoid superfluous short-range singularities; see Appendix A for details. The generation of $\eta_i$, $i = 1, \dots, L$, is described in Appendix A, following the optimized recipes proposed in Ref. [20]. For numerical computations we have chosen $T = 1$ and $L = 10^4$ or $L = 5 \times 10^4$.
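
Eq. (75) can be evaluated directly for a given noise realization. The following sketch assumes that $X$ is the thermal average of the number $k$ of opened base-pairs taken with the Boltzmann weights of Eq. (75) (the defining relation (5) is not reproduced in this section), and it subtracts the largest exponent before exponentiating, so that large $L$ does not cause overflow. The function name `mean_unzipped` is illustrative.

```python
import math

def mean_unzipped(f, eta, beta=1.0):
    """Thermal average of k with weights exp[-beta*(f*k + eta_1 + ... + eta_k)].

    `eta` is a sequence of L noise values for one realization (eta[0] is eta_1).
    """
    # Exponents -beta*E_k for k = 1..L, built from the running sum of the noise.
    exponents = []
    running = 0.0
    for k, e in enumerate(eta, start=1):
        running += e
        exponents.append(-beta * (f * k + running))
    shift = max(exponents)  # log-sum-exp shift for numerical stability
    z = 0.0   # shifted partition function
    zk = 0.0  # shifted sum of k * weight
    for k, x in enumerate(exponents, start=1):
        w = math.exp(x - shift)
        z += w
        zk += k * w
    return zk / z
```

Without noise the weights are geometric, so for large $L$ the function should return $1/(1 - e^{-\beta f})$, which gives a quick sanity check of an implementation.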

As $L$ is now explicitly finite, one should be careful with the selection of the thermodynamic domain, since by the very statement of the problem the limit $L \to \infty$ is taken before $f \to 0$. As a plausible estimate of this domain, one can use the condition $f\sqrt{L} \gg 1$. We confirmed it in several ways, by reproducing predictions made in the thermodynamic limit $L \to \infty$.

A. Uncorrelated noise. 

Let us start with the uncorrelated noise case, where the $\eta_i$ are independent Gaussian variables with zero average, $\langle\eta_i\rangle = 0$, and variance $\langle\eta_i^2\rangle = D = 1$ (white noise), and where $X$ is given by (5, 75). For comparison we also studied the case where the $\eta_i$ are independent random variables assuming the values $\eta_i = \pm 1$ with equal probability (dichotomic noise).

The results are illustrated by Figs. 3 and 4, where we display $\langle X\rangle$ and $X$ for several typical realizations. It is seen that $\langle X\rangle$ and $X$ do not coincide exactly, as is expected in general, if for no other reason than the finite magnitude of $L$. However, in the considered thermodynamic domain of $f$ the various typical realizations qualitatively resemble each other and, therefore, resemble $\langle X(f)\rangle$. In particular, for all typical realizations $X(f)$ grows for $f \to 0$. In that sense $f = 0$ is a special point for both the typical $X$ and $\langle X\rangle$. It should be mentioned that for $f < 0.05$ we have seen realizations containing relatively sudden jumps at realization-dependent values of $f$. This differs from the behavior of $\langle X\rangle$ and agrees with the results of Ref. [6]. However, such small values of $f$ are not in the thermodynamic domain. Acknowledging the reservations connected with the numerical character of our study, we nevertheless conclude that the uncorrelated-noise situation is self-averaging, at least for not very small values of $f$, $f\sqrt{L} \gg 1$.

It is seen from Fig. 4 that the white and the dichotomic noise produce very similar results. This is to be expected for the considered large values of $L$ (law of large numbers). Fig. 5 shows that the power law $\langle X\rangle \propto f^{-2}$ for the white-noise case is recovered by direct averaging over various realizations. Indeed, it is seen from this figure that one recovers

$$
\langle X\rangle \propto f^{-1.84} \tag{76}
$$

after averaging over $10^3$ realizations in the domain $0.05 < f < 0.25$. This result is stable upon increasing the number of realizations, e.g., from $10^3$ to $2 \times 10^3$.

FIG. 3: Solid curves: $X(f)$ for several realizations of the white uncorrelated noise. Dotted curve: $\langle X(f)\rangle$ obtained by averaging over $10^3$ realizations. $T = D = 1$, $L = 5 \times 10^4$.

FIG. 4: Solid curves: $X(f)$ for several realizations of the dichotomic uncorrelated noise ($\eta_i = \pm 1$ with equal probability). Dotted curve: $\langle X(f)\rangle$ obtained by averaging over $10^3$ realizations. $T = 1$, $L = 5 \times 10^4$.

FIG. 5: The dependence of $-\ln\langle X(f)\rangle$ on $\ln f$ for various values of $L$ and for $T = D = 1$. The quantity $\langle X(f)\rangle$ was obtained by direct averaging over $10^3$ realizations. The solid line indicates the linear fit $-\ln\langle X(f)\rangle = A + 1.84179 \ln f$ for $L = 5 \times 10^4$, where $A$ is a constant. The emergence of the power law (76) is thus displayed explicitly.
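
The averaging and fitting procedure behind Eq. (76) and Fig. 5 can be sketched as follows. This is a small-scale illustration, not the computation reported here: it uses pure standard-library Python, a much smaller $L$ and far fewer realizations than in the text, and it assumes that $X$ is the thermal average of $k$ with the weights of Eq. (75). All function names and default sizes are illustrative; production-size runs would call for an array library.

```python
import math
import random

def mean_unzipped(f, partial_sums, beta=1.0):
    """Thermal average of k for one realization, given S_k = eta_1 + ... + eta_k."""
    exponents = [-beta * (f * k + s) for k, s in enumerate(partial_sums, start=1)]
    shift = max(exponents)  # avoid overflow in exp
    z = zk = 0.0
    for k, x in enumerate(exponents, start=1):
        w = math.exp(x - shift)
        z += w
        zk += k * w
    return zk / z

def averaged_curve(forces, L=4000, realizations=100, seed=0):
    """<X(f)> for white Gaussian noise (T = D = 1), averaged over realizations.

    The same noise realization is reused for every force in `forces`.
    """
    rng = random.Random(seed)
    totals = [0.0] * len(forces)
    for _ in range(realizations):
        s = 0.0
        sums = []
        for _ in range(L):
            s += rng.gauss(0.0, 1.0)
            sums.append(s)
        for i, f in enumerate(forces):
            totals[i] += mean_unzipped(f, sums)
    return [t / realizations for t in totals]

def fitted_exponent(forces, averages):
    """Least-squares slope of -ln<X> versus ln f."""
    xs = [math.log(f) for f in forces]
    ys = [-math.log(a) for a in averages]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx
```

A call such as `fitted_exponent(forces, averaged_curve(forces))` with forces in the window $0.05 < f < 0.25$ returns an effective exponent; with the reduced sizes used here it fluctuates noticeably from seed to seed, so it should be read only as a qualitative check.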



B. Long-range correlated noise 



1. Typical realizations.



The situation for the long-range correlated noise with $\alpha = 0.5$, $T = \sigma = 1$ is illustrated by Figs. 6, 7, 8 and 9. The first point to note is that now there are typical realizations with radically different properties. The first type of realization is presented in Figs. 6 and 7: $X(f)$ increases by several sudden jumps followed by flat regions. It is seen that $X|_{f=0}$ is either equal to its maximal possible value $L$ or is close to it. The points where $X(f)$ jumps vary from one realization to another. However, the overall number of jumps when varying $f$ between zero and one is typically two or three.

FIG. 6: Realizations of $X(f)$ from the first class of typicality. $T = \sigma = 2\alpha = 1$, $L = 10^4$.

In contrast to this, Figs. 8 and 9 present a strikingly different situation, characterized by a very smooth behavior of $X(f)$ for $f > 0$. In particular, $X|_{f=0}$ is much smaller than $L$ (typically by a few orders of magnitude). $X(f)$ is still a monotonic function of $f$, but the point $f = 0$, where the energy supplied by the external unzipping force is equal to the average binding energy of a base-pair, is by no means special.

To estimate the frequency with which each scenario is met among all possible realizations, we have adopted the following criteria for deciding whether a given realization belongs to one of the above classes: for $L = 5 \times 10^4$ we assign the given realization to the first class if $X(f = 0) > 4.8 \times 10^4$, while it is assigned to the second class if X(f = 0) < 10^. These criteria proved adequate, as they are consistent with the presence (for the first class) or absence (for the second class) of sudden jumps of $X(f)$.

In this way the frequencies of each class were estimated in a sample of $10^3$ realizations. For $L = 5 \times 10^4$ and $T = \sigma = 2\alpha = 1$ the first scenario is met in $\simeq 84\%$ of all cases (839 in $10^3$ realizations), while the second scenario is present in $\sim 12\%$ of all cases (118 in $10^3$ realizations). These fractions are stable upon increasing the size of the sample on which the above estimates were carried out. Interestingly enough, realizations where $X(f)$ as a function of $f$ falls into neither of the above two classes amount to only $\sim 4\%$ of all possible cases.
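
The bookkeeping behind these class fractions can be sketched in a few lines. In the illustration below both thresholds on $X(f = 0)$ are left as parameters to be supplied by the caller; the function names and the convention of returning 0 for "neither class" are assumptions of this sketch.

```python
def classify_realization(x_at_zero_force, upper, lower):
    """Assign a realization to class 1, class 2, or neither (0) from X(f = 0).

    `upper` and `lower` are the thresholds on X(f = 0); they must be supplied
    by the caller, with lower < upper.
    """
    if lower >= upper:
        raise ValueError("lower threshold must be below upper threshold")
    if x_at_zero_force > upper:
        return 1
    if x_at_zero_force < lower:
        return 2
    return 0

def class_fractions(x_values, upper, lower):
    """Fractions of realizations in class 1, class 2, and in neither class."""
    counts = {1: 0, 2: 0, 0: 0}
    for x in x_values:
        counts[classify_realization(x, upper, lower)] += 1
    n = len(x_values)
    return counts[1] / n, counts[2] / n, counts[0] / n
```

For example, `class_fractions(xs, upper=4.8e4, lower=lower_threshold)` applied to a list `xs` of values $X(f = 0)$, one per realization, returns the three fractions in the order first class, second class, neither; `lower_threshold` is a placeholder for the second-class criterion.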

FIG. 7: Realizations of $X(f)$ from the first class of typicality. $T = \sigma = 2\alpha = 1$, $L = 5 \times 10^4$.

FIG. 8: Realizations of $X(f)$ from the second class of typicality. $T = \sigma = 2\alpha = 1$, $L = 10^4$.

It is relevant to note that the fractions of the two classes show a tendency to move towards each other upon decreasing the size of the system. For instance, the fractions of the first and the second class amount to 18% and 76%, respectively, for $L = 10^4$ ($T = \sigma = 2\alpha = 1$). These fractions were estimated by the criteria $X(f = 0) > 0.8 \times 10^4$ and X(f = 0) < 10^, respectively. Recall in this context that the chosen values for $L$ are sensible, since the typical DNA samples used in
experiment have L ~ 10'' — 10^; see, e.g.ji^ and also section IVB 21 

Since there are typical realizations which are so different from each other, we conclude that this long-range situation is essentially non-self-averaging in the whole physical domain $0 < f < 1$ and, in particular, in the thermodynamic domain of $f$. This fact distinguishes the long-range correlated situation from the uncorrelated (white noise) one. It should be noted that, due to the law of large numbers, non-self-averaging present in the whole domain $0 < f < 1$ is certainly impossible for the uncorrelated (or weakly correlated) noise [22-24]. For the long-range correlated case the law of large numbers does not apply, and the above effect becomes possible.

Our discussion of the frozen noise presented in section IV allows us to provide a qualitative explanation for the features of the above two classes of typical realizations. One notes that a sizeable portion of the long-range correlated noise realizations can be seen as several pieces of frozen noise with different $\eta$'s put next to each other. Now recall from (55) that every sufficiently long piece of that type has a single first-order phase transition with a jump proportional to its length.
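
This picture of frozen pieces placed next to each other can be made concrete with a toy realization. The sketch below builds a noise sequence from three frozen pieces with hypothetical values $\eta = -0.6,\ 0.5,\ -0.9$ (chosen only for illustration) and evaluates the thermal average of $k$ with the weights of Eq. (75), assuming that this average is the quantity $X$. For these values the energy $fk + \sum_{j \le k}\eta_j$ is piecewise linear in $k$, and its minimum moves from $k = L$ to $k = L/3$ at $f = 0.2$ and from $k = L/3$ to $k \approx 1$ at $f = 0.6$, so $X(f)$ shows two jumps of order $L$.

```python
import math

def mean_unzipped(f, eta, beta=1.0):
    """Thermal average of k with weights exp[-beta*(f*k + eta_1 + ... + eta_k)]."""
    exponents = []
    running = 0.0
    for k, e in enumerate(eta, start=1):
        running += e
        exponents.append(-beta * (f * k + running))
    shift = max(exponents)  # log-sum-exp shift for numerical stability
    z = zk = 0.0
    for k, x in enumerate(exponents, start=1):
        w = math.exp(x - shift)
        z += w
        zk += k * w
    return zk / z

def piecewise_frozen_noise(values, piece_length):
    """Noise made of consecutive frozen pieces, each with a constant eta."""
    eta = []
    for v in values:
        eta.extend([v] * piece_length)
    return eta

def scan_forces(eta, forces):
    """X(f) on a grid of forces for one fixed realization."""
    return [mean_unzipped(f, eta) for f in forces]
```

Scanning `forces = [0.1, 0.4, 0.8]` for `piecewise_frozen_noise([-0.6, 0.5, -0.9], 1000)` places $X$ close to $L$, close to $L/3$, and at a few base-pairs, respectively, which mimics the first class of typical realizations.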

The same reasoning can be applied to understand the existence of the second class, where $X(f)$ is a smooth function of $f$ and $X(f = 0) \ll L$. Here one should note that, within the above qualitative image of a long-range correlated random sequence, there are realizations of the noise where all $\eta$'s are positive, and thus all jumps of $X(f)$ can occur only for negative $f < 0$, that is, beyond the domain of our interest.

FIG. 9: Realizations of $X(f)$ from the second class of typicality. $T = \sigma = 2\alpha = 1$, $L = 5 \times 10^4$.

FIG. 10: $-\ln\langle X\rangle$ versus $\ln f$ for the long-range correlated noise with various $\alpha$'s and $L = 5 \times 10^4$, $T = \sigma = 1$. The quantity $\langle X(f)\rangle$ was obtained by direct averaging over $10^3$ realizations.

2. Inferring phase transitions. 

FIG. 11: $-\ln\big[\langle X\rangle\,(L/2)^{-1+\delta}\big]$ versus $\ln f$ for various $L$'s and $T = \sigma = 1$, $\alpha = 0.25$, $\delta = 0.0625$. The quantity $\langle X(f)\rangle$ was obtained by direct averaging over $10^3$ realizations.

FIG. 12: The same as in Fig. 11 but with $\alpha = 0.5$ and $\delta = 0.075$.

Let us finally discuss whether we can infer phase transitions by studying the typical realizations. First of all, it is obvious that once we do not have self-averaging, phase transitions should be studied on typical scenarios of behavior for $X$ and not on the behavior of its average $\langle X\rangle$. There is another aspect which is certainly more subtle: phase transitions are typically defined in the thermodynamic limit, and one needs the special tools of finite-size scaling for their identification in the results of numerical computations, which are necessarily done on finite systems. The idea of finite-size scaling is thus to extrapolate these results to the thermodynamic limit. However, there is another, somewhat different line of thought [22], which identifies the proper thermodynamic quantities (such as entropy, free energy, order parameter, etc.) directly for finite systems, and then searches the space of parameters for points having a special character for these quantities. This approach has proved itself in studying phase transitions in atomic and nuclear physics, and in systems with long-range interactions (e.g., a gas of self-gravitating particles). For the present study of DNA there is a related aspect that should be taken into account: in natural conditions the
number of base-pairs is large, hut finite. Here L is of order of 10^ ~ 10^, see e.gi^, as we have mentioned already. 
It is, therefore, clear that the considered finite size aspect of DNA is something generic, and not only connected with 
natural limitations of numerical methods. 

Let us now return to the situation presented in Figs. 6, 7, 8 and 9. We are going to use the analogy with the case of the totally correlated noise described in section IV. It was seen already that this analogy helped us to draw useful qualitative conclusions from the numerical data. For the totally correlated noise the point of the phase transition is unambiguously identified with the realization-dependent value $f = -\eta$. At this point the order parameter $X$ has a jump of order $L$; see Eqs. (56, 57). It may be useful to repeat that the most unusual aspect of this phase transition is that its point is strictly realization-dependent. The same philosophy can now be applied to Figs. 6 and 7: there are realization-dependent values of $f$ where $X$ has jumps of order $L/2$ (recall that for the figures we have taken $L = 10^4$ or $L = 5 \times 10^4$). It is seen as well that there can be several such phase transitions for a single system (a single realization of the noise). The latter fact can by itself appear rather surprising. However, it is known that some disordered [24], or deterministic but strongly frustrated [25], systems can experience several phase transitions; there can even exist quasi-continuous domains of criticality [25]. With the same logic one sees that the typical realizations presented in Figs. 8 and 9 do not have phase transitions in the domain $0 < f < 1$, at least for the considered values of $L$.

FIG. 13: The same as in Fig. 11 but with $\alpha = 0.75$ and $\delta = 0.08$.

FIG. 14: $\delta(\alpha)$ defined by Eq. (78) versus $\alpha$.



C. The behavior of $\langle X\rangle$.

As we already noted, once the effect of non-self-averaging is present, the basic physical quantities are the typical realizations, since it is their features that are directly observed in experiments. It is, however, still relevant to know the behavior of the average number of unzipped bonds $\langle X\rangle$, since it illustrates the precise differences as compared to typical realizations.

Here we report on two features of $\langle X\rangle$ as a function of $f$. The first is how $\langle X\rangle$ depends on $f$ for small values of $f$; in particular, whether there is any power-law dependence similar to $\langle X(f)\rangle \propto f^{-2}$, which is present in the uncorrelated noise situation and was verified numerically in section V A. Note that for the long-range correlated situation with the index $\alpha$ such a power law

$$
\langle X\rangle \propto f^{-2/\alpha} \tag{77}
$$

was recently predicted for small $f$; see Ref. [6]. The most adequate way to look for power laws is to plot $-\ln\langle X\rangle$ as a function of $\ln f$; a power law then displays itself as a straight line. Fig. 10 displays such a plot obtained for various values of $\alpha$ and $L = 5 \times 10^4$. The quantity $\langle X\rangle$ was calculated by direct averaging over $10^3$ realizations, and the results were checked for stability upon doubling the number of realizations. As seen, this figure shows a very weak dependence of $-\ln\langle X\rangle$ on $\ln f$. There are no convincing indications of a power law. In particular, when decreasing $\alpha$ the dependence of $-\ln\langle X\rangle$ on $\ln f$ becomes weaker, in obvious contrast with the prediction made by Eq. (77). For $\alpha \to 0$ this behavior coincides with that of the exact solution discussed in section IV.

It should be noted that $-\ln\langle X\rangle$ is a perfectly smooth function of $\ln f$: all jumps and flat regions present for the first class of typical realizations (which involves the majority of realizations) are washed out when averaging over $10^3$ realizations. This gives another indication that the points of the jumps in the above class are completely random and vary from one realization to another.

Once we realized that in a rather wide interval of $f$'s (typically $\ln f < -0.5$, as seen in Fig. 10) the dependence of $\langle X\rangle$ on $f$ is weak, we studied the behavior of $\langle X(f = 0)\rangle$ as a function of $L$ and $\alpha$. As shown by Figs. 11, 12 and 13, the numerical results fit well the following scaling relation:

$$
\langle X(f = 0)\rangle \propto \left(\frac{L}{2}\right)^{1-\delta(\alpha)}. \tag{78}
$$

The values of $\delta(\alpha)$ for several $\alpha$'s are shown in Fig. 14. For $\alpha \approx 0$ we get $\delta = 0.01$, which is in good agreement with the exact value $\delta(\alpha = 0) = 0$ obtained in section IV.

Two important features of the result (78) should be mentioned. First, as seen from Figs. 11, 12 and 13, the value of $\langle X\rangle(f = 0)$ adequately characterizes the whole domain of small $f$, since the dependence of $\langle X\rangle$ on $f$ is weak. Second, as seen from Fig. 14, the function $\delta(\alpha)$ increases with $\alpha$, but saturates for $\alpha > 0.5$ at $\delta = 0.08$. It appears that the same result (78) with the index $\delta = 0.08$ holds for the uncorrelated noise, but there its region of validity is restricted (for $L = 5 \times 10^4$, $T = D = 1$) to very small values of $f$, $f < 0.01$, in contrast to the long-range situation. Thus, as far as the small-$f$ characteristics are concerned, the result (78) seems to be universal, and it is likely that $\delta(\alpha)$ can have the same status as critical indices in the usual theory of phase transitions.
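
An exponent of this kind is typically extracted by a straight-line fit on a log-log scale. The sketch below assumes that the scaling relation (78) has the form $\langle X(f = 0)\rangle \propto (L/2)^{1-\delta}$, which is how the rescaling in the captions of Figs. 11-13 reads, and estimates $\delta$ from a set of system sizes and the corresponding averages. The function name `fit_delta` and the synthetic data in the usage note are illustrative.

```python
import math

def fit_delta(sizes, averages):
    """Estimate delta from <X(f=0)> ~ (L/2)**(1 - delta) by log-log least squares.

    `sizes` holds the system sizes L, `averages` the corresponding <X(f=0)>.
    """
    xs = [math.log(L / 2.0) for L in sizes]
    ys = [math.log(a) for a in averages]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx  # estimates 1 - delta
    return 1.0 - slope
```

Feeding the function synthetic data generated with a known exponent, e.g. `averages = [3.0 * (L / 2.0) ** 0.92 for L in sizes]`, should return $\delta = 0.08$ up to rounding, which is a convenient check before applying it to simulated averages.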

We conclude by repeating the two main qualitative features of the average number $\langle X(f)\rangle$ of unzipped bonds as revealed by our numerical analysis: in the long-range situation and for small forces $f$, the behavior of $\langle X(f)\rangle$ as a function of $f$ does not display any power law and is governed by its value at $f = 0$. The latter satisfies the power law (78) as a function of $L$.

VI. SUMMARY AND CONCLUSION. 

In the present paper we have studied how statistical correlations present in the base-sequence of a DNA molecule influence the process of unzipping. There were two related motivations for our study. On the one hand, the existence of these correlations, which can have both finite-range and long-range character, is by now a well-established fact [9-12]. It is, therefore, legitimate to study how they influence the physics of DNA. On the other hand, general qualitative predictions drawn about this influence can be used for explaining the reason for the rich correlated structures found in the base-sequence of DNA. Recall that various segments of a DNA molecule can have different (finite-range or long-range [11,12]) correlation structures. Moreover, DNA molecules belonging to different evolutionary classes have
different correlation properties of their base-sequencesiS. 

The model we studied contains only the minimal number of ingredients needed to describe unzipping and to account for correlations in the base-sequence of DNA. Therefore, many realistic features of the unzipping process remain beyond our study. We nevertheless believe that the obtained results will be useful, especially for drawing qualitative conclusions.

Let us now briefly summarize our results, starting from the finite-range correlated situation. In section III A we have shown that the presence of a finite correlation length $\tau$ plays a stabilizing role for the unzipping process: for a fixed external force $f$ the average number of broken base-pairs decreases with increasing $\tau$. If only finite-range correlations are present, the process of unzipping does not depend much on the detailed structure of the base-sequence: all typical scenarios of unzipping (i.e., those frequently met among all possible base-sequences) have the same qualitative pattern of behavior, that is, the number of broken base-pairs diverges as the external force approaches its critical value, $f \to 0$. This divergence can be adequately understood by studying the number of broken base-pairs averaged over all possible base-sequences. All in all, one can say that the basic influence of finite-range correlations is to stabilize the DNA molecule with respect to the external unzipping force.

The influence of long-range correlations is certainly more drastic. Possibly the most important aspect is that the situation is essentially non-self-averaging: there are two radically different scenarios of typical unzipping, which depend on the detailed structure of the base-sequence and which do not coincide with the behavior averaged over all possible base-sequences. Within the first scenario, the number of broken base-pairs $X(f)$ shows, as a function of the external force $f$, a sequence of sharp jumps at sequence-dependent values of $f$. The overall number of jumps is nearly constant within the class. Each jump has a magnitude comparable with $L$, that is, under a small change of $f$ a large number of base-pairs can be opened. The point $f = 0$ is special, since $X(0)$ either coincides with $L$ or at least is very close to it. We argued in section V B 2 that it is sensible to describe this scenario as a sequence of phase transitions. Such an effect is known from other disordered or strongly frustrated systems [24,25].

The second typical scenario is crucially different. Now $X$ is a smooth, slowly changing function of the external force $f$ in the whole relevant domain $0 < f < 1$. There is no sign of a phase transition, and the value $f = 0$ is not distinguished from $f > 0$ as far as $X$ is concerned. DNA molecules which, due to the structure of their base-sequence, fall into this class are thus rather stable with respect to the external unzipping force.

It appears, interestingly enough, that the qualitative and even some quantitative features of the long-range correlated situation can be understood via the analytical solution of the model with the totally correlated (frozen) noise, which we presented in section IV. In particular, this allows us to explain why there exist two typical scenarios with widely different behavior of the number of unzipped base-pairs, and provides rather robust analytical indications of the phenomenon of non-self-averaging.

Summarizing features of these two scenarios, one can say that long-range correlations increase the adaptability of 
the corresponding DNA molecule, since in some typical scenarios it becomes more stable with respect to the force 
(any sharp transition is absent), while in others the unzipping is realized via a sequence of sharp phase transitions. 
The actual scenario for a single molecule will crucially depend on the detailed structure of the base-sequence. 

We also studied how the average number $\langle X\rangle$ of unzipped base-pairs depends on the applied force $f$. In contrast to the white-noise situation, where the behavior of $\langle X\rangle$ for small (that is, critical) forces $f \to 0$ is governed by the power law $\langle X\rangle \sim f^{-2}$, we found numerically no indications of a power law for small forces in the long-range correlated situation. Instead, the dependence of $\langle X(f)\rangle$ on $f$ for small $f$ is very weak and is to a large extent governed by $\langle X(f = 0)\rangle$. The latter quantity displays the power-law behavior (78) as a function of $L$. The region of validity of this power law appeared to be unexpectedly wide.

We hope that these results will contribute to understanding the role and the purpose of correlated structures in DNA.

Acknowledgments 

The work of Zh.S. Gevorkian, C.-K. Hu and W.-C. Wu was supported in part by the National Science Council in Taiwan under Grant No. NSC 92-2112-M-001-063.

The work of A.E. Allahverdyan is part of the research programme of the Stichting voor Fundamenteel Onderzoek der Materie (FOM), financially supported by the Nederlandse Organisatie voor Wetenschappelijk Onderzoek (NWO).

Zh.S. Gevorkian acknowledges interesting discussions with A. Maritan and D. Marenduzzo.



* Electronic address: armena@science.uva.nl
† Electronic address: gevorkia@phys.sinica.edu.tw
‡ Electronic address: huck@phys.sinica.edu.tw
§ Electronic address: mcwu@phys.sinica.edu.tw

[1] D. Freifelder and G.M. Malacinski, Essentials of Molecular Biology (Jones and Bartlett Publishers, Boston & London, 1993).
[2] A.Yu. Grosberg and A.R. Khokhlov, Statistical Physics of Macromolecules (American Institute of Physics, New York, 1994).
[3] S.M. Bhattacharjee, J. Phys. A 33, L423 (2000); K.L. Sebastian, Phys. Rev. E 62, 1128 (2000); E. Mukamel and E. Shakhnovich, cond-mat/0108447; E. Orlandini et al., cond-mat/0109521; S. Cocco et al., cond-mat/0206238.
[4] S.M. Bhattacharjee and D. Marenduzzo, J. Phys. A 35, L141 (2002); cond-mat/0106110.
[5] D. Marenduzzo, A. Trovato and A. Maritan, Phys. Rev. E 64, 031901 (2001); Phys. Rev. Lett. 88, 028102 (2002).
[6] D.K. Lubensky and D.R. Nelson, Phys. Rev. Lett. 85, 1572 (2000); Phys. Rev. E 65, 031917 (2002).
[7] B. Essevaz-Roulet, U. Bockelmann and F. Heslot, Proc. Nat. Acad. Sci. 94, 11935 (1997); T.T. Perkins, D.E. Smith and S. Chu, Science 264, 819 (1994); R. Merkel et al., Nature 397, 50 (1999).
[8] T.R. Strick et al., Rep. Prog. Phys. 66, 1 (2003).
[9] W. Li, Int. J. Bif. & Chaos 2, 137 (1992); I. Amato, Science 257, 74 (1992); W. Li and K. Kaneko, Europhys. Lett. 17, 655 (1992); C.-K. Peng et al., Nature 356, 168 (1992); S. Buldyrev et al., Phys. Rev. E 47, 4514 (1993); ibid. 51, 5084 (1995).
[10] R. Voss, Phys. Rev. Lett. 68, 3805 (1992); X. Lu et al., Phys. Rev. E 58, 3578 (1998).
[11] C.A. Chatzidimitriou-Dreismann and D. Larhammar, Nature 361, 212 (1993); V.V. Prabhu and J.M. Claverie, Nature 359, 782 (1992); A.K. Mohanti and A.V. Narayana Rao, Phys. Rev. Lett. 84, 1832 (2000); B. Audit et al., Phys. Rev. Lett. 86, 2471 (2001); S. Guharay et al., Physica D 146, 388 (2000).
[12] W. Li, Comput. Chem. 21, 257 (1997).
[13] M. Vieira, Phys. Rev. E 60, 5932 (1999).
[14] E.S. Mamasakhlisov et al., J. Phys. A 30, 7765 (1997).
[15] M. Opper, J. Phys. A 26, L719 (1993); C. Monthus and A. Comtet, J. Phys. I (France) 4, 635 (1994).
[16] M.Ya. Azbel, Phys. Rev. Lett. 31, 589 (1973).
[17] H. Risken, The Fokker-Planck Equation (Springer-Verlag, Berlin, 1989).
[18] P. Jung and P. Hänggi, Phys. Rev. A 35, 4467 (1987).
[19] R. Fox, Phys. Rev. A 33, 467 (1986).
[20] A.H. Romero and J.M. Sancho, J. Comp. Phys. 156, 1 (1999); cond-mat/9903267; H.A. Makse et al., Phys. Rev. E 53, 5445 (1996).
[21] Note, however, that there are exceptions to this rule for certain bacteria [1]; for them the concentration of GC pairs can differ substantially from that of AT pairs. These situations are, nevertheless, fairly rare.
[22] D.H.E. Gross, Microcanonical Thermodynamics: Phase Transitions in Finite Systems (World Scientific, Singapore, 2001).
[23] R. Brout, Phys. Rev. 115, 824 (1959).
[24] B. Derrida, Phys. Rep. 103, 29 (1984).
[25] A.E. Allahverdyan, N.S. Ananikian and S.K. Dallakian, Phys. Rev. E 57, 2452 (1998).
[26] W.H. Press, S.A. Teukolsky, W.T. Vetterling and B.P. Flannery, Numerical Recipes in Fortran: The Art of Scientific Computing, 2nd edn. (Cambridge University Press, USA, 1992).

APPENDIX A: GENERATION OF THE LONG-RANGE CORRELATED NOISE. 

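Using the ideas of Ref. [20], we describe here a method for the numerical generation of a Gaussian random noise $\eta(t)$ with zero average and an arbitrary symmetric autocorrelation function:

$$
K(t - t') = \langle \eta(t)\eta(t')\rangle, \tag{A1}
$$

$$
K(t) = K(-t). \tag{A2}
$$

Assume that the noise is periodic with period $M$:

$$
\eta(t) = \eta(t + M). \tag{A3}
$$

Therefore $K(t)$ is also periodic with the same period and can be expanded as

$$
K(t) = \sum_{n=-\infty}^{\infty} k_n e^{-i n \omega_0 t}, \qquad \omega_0 = \frac{2\pi}{M}, \tag{A4}
$$

where $k_n$ is given by the Fourier formula:

$$
k_n = \frac{1}{M}\int_{-M/2}^{M/2} dt\, K(t)\, e^{i n \omega_0 t}. \tag{A5}
$$

Since $K(t)$ is a real and symmetric function, $k_n = k_n^* = k_{-n}$, and thus

$$
k_n = \frac{2}{M}\int_{0}^{M/2} dt\, K(t) \cos(n\omega_0 t). \tag{A6}
$$

It is now straightforward to see that the noise $\eta$ we are looking for is represented as

$$
\eta(t) = \sum_{n=-\infty}^{\infty} \sqrt{k_n}\, \eta_n\, e^{-i n \omega_0 t}, \tag{A7}
$$

where the $\eta_n$ are complex Gaussian random variables with

$$
\langle \eta_n \eta_m\rangle = \delta(n + m), \tag{A8}
$$

where $\delta(0) = 1$ and $\delta(k) = 0$ for $k \neq 0$. Indeed, once the $\eta_n$ are assumed to be Gaussian, $\eta(t)$ is Gaussian as well; moreover, (A1) is valid, since (A7), (A8) and $k_n = k_{-n}$ give $\langle\eta(t)\eta(t')\rangle = \sum_n k_n e^{-i n\omega_0 (t - t')} = K(t - t')$. The complex random variables $\eta_n$ can be conveniently expressed via real random variables:

$$
\eta_n = \frac{1}{\sqrt{2}}\,(a_n + i b_n), \quad \text{for } n \geq 1, \tag{A9}
$$

$$
\eta_n = \frac{1}{\sqrt{2}}\,(a_{-n} - i b_{-n}), \quad \text{for } n \leq -1, \tag{A10}
$$

$$
\eta_0 = a_0, \tag{A11}
$$

where the quantities $a_n$ and $b_n$ are independent, zero-average Gaussian random variables normalized to one:

$$
\langle a_k a_l\rangle = \delta_{kl}, \qquad \langle b_k b_l\rangle = \delta_{kl}, \qquad \langle a_k b_l\rangle = 0. \tag{A12}
$$

Using this one writes

$$
\eta(t) = \sum_{n=1}^{\infty} \sqrt{2k_n}\,\big[a_n \cos(n\omega_0 t) + b_n \sin(n\omega_0 t)\big] + \sqrt{k_0}\, a_0. \tag{A13}
$$

Let us consider an example:

$$
K(t) = \sigma \ \ \text{for } |t| < 1, \qquad K(t) = \frac{\sigma}{\sqrt{|t|}} \ \ \text{for } |t| > 1. \tag{A14}
$$

This represents a long-range correlated noise regularized for small $t$. For this autocorrelation function the coefficients $k_n$ follow from (A6):

$$
k_0 = 2\sigma\left(\frac{1}{\sqrt{M/2}} - \frac{1}{M}\right), \qquad
k_n = \frac{\sigma \sin(n\omega_0)}{\pi n} + \frac{2\sigma}{\sqrt{nM}}\left[F_C\big(\sqrt{2n}\big) - F_C\Big(2\sqrt{n/M}\Big)\right], \tag{A15}
$$

where $F_C(x)$ is Fresnel's C-function:

$$
F_C(x) = \int_0^x dt\, \cos\left(\frac{\pi t^2}{2}\right). \tag{A16}
$$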

1. Numerical implementation. 

Eqs. (A13, A15) are sufficient for generating long-range correlated, periodic Gaussian random noise. However, for numerical implementations this noise has to be discretized. First we note that the above method produces periodic random noise (with period $M$), while our problem does not have any periodicity. Therefore, we have chosen $M = 2L$ and took the discrete values $t = 1, 2, \dots, L$ in Eq. (A13), thereby generating $L$ long-range correlated random numbers without any periodicity. Note that (A13) contains infinity as the upper limit of the summation in its RHS. For numerics this infinity should obviously be replaced by some number larger than $L$, and additionally one should
check that the situation is stable with respect of varying this number. As for concrete calculations we have used, e.g., 
L = 10'', we found sufficient to take for this upper summation limit 10"*. 

Numerical simulations in section V were performed using independent Gaussian random variables generated by the "gasdev" algorithm of Ref. [26]. The long-range correlated noise was generated following the scheme described in this Appendix.
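
The scheme of this Appendix can be sketched in code as follows. This is a minimal illustration and not the implementation used for the results above: the coefficients $k_n$ are obtained from (A6) by a plain midpoint rule instead of the closed form (A15), the series (A13) is truncated at a user-chosen number of modes, and negative $k_n$ (which can occur for a regularized $K(t)$) are clipped to zero before taking square roots. The clipping, the function names and the default parameters are assumptions of this sketch.

```python
import math
import random

def cosine_coefficients(K, M, n_max, steps=2000):
    """k_n = (2/M) * integral_0^{M/2} K(t) cos(n*w0*t) dt for n = 0..n_max.

    The integral is evaluated with the midpoint rule on `steps` subintervals.
    """
    w0 = 2.0 * math.pi / M
    h = (M / 2.0) / steps
    ts = [(j + 0.5) * h for j in range(steps)]
    ks = [K(t) for t in ts]
    coeffs = []
    for n in range(n_max + 1):
        total = sum(kv * math.cos(n * w0 * t) for kv, t in zip(ks, ts))
        coeffs.append(2.0 / M * total * h)
    return coeffs

def generate_noise(coeffs, M, L, rng):
    """eta(t) for t = 1..L from the truncated series (A13).

    Negative coefficients are clipped to zero before taking square roots
    (an assumption of this sketch, not a prescription from the text).
    """
    w0 = 2.0 * math.pi / M
    amp = [math.sqrt(max(k, 0.0)) for k in coeffs]
    a = [rng.gauss(0.0, 1.0) for _ in coeffs]
    b = [rng.gauss(0.0, 1.0) for _ in coeffs]
    eta = []
    for t in range(1, L + 1):
        value = amp[0] * a[0]
        for n in range(1, len(coeffs)):
            phase = n * w0 * t
            value += math.sqrt(2.0) * amp[n] * (a[n] * math.cos(phase) + b[n] * math.sin(phase))
        eta.append(value)
    return eta

def regularized_power_law(t, sigma=1.0, alpha=0.5):
    """K(t) = sigma for |t| <= 1 and sigma/|t|**alpha otherwise."""
    t = abs(t)
    return sigma if t <= 1.0 else sigma / t ** alpha
```

With $M = 2L$, a call such as `generate_noise(cosine_coefficients(regularized_power_law, 2 * L, n_max), 2 * L, L, random.Random(seed))` returns one realization of length $L$; as noted above, the number of retained modes should be varied to check that the results are stable.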

APPENDIX B: DERIVATION OF TWO ASYMPTOTIC RELATIONS. 

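Here we derive the following asymptotic identities used in the main text:

$$
\int_a^{\infty} \frac{d\xi}{\sqrt{2\pi}}\, e^{-\xi^2/2} = \frac{e^{-a^2/2}}{a\sqrt{2\pi}}\left(1 - \frac{1}{a^2}\right) + \dots, \qquad a \gg 1, \tag{B1}
$$

$$
\int_0^{a} d\xi\, e^{\xi^2/2} = \frac{e^{a^2/2}}{a}\left(1 + \frac{1}{a^2}\right) + \dots, \qquad a \gg 1. \tag{B2}
$$

The first one is easily obtained via integration by parts:

$$
\int_a^{\infty} \frac{d\xi}{\sqrt{2\pi}}\, e^{-\xi^2/2} = -\int_a^{\infty} \frac{d\, e^{-\xi^2/2}}{\xi\sqrt{2\pi}} = \frac{e^{-a^2/2}}{a\sqrt{2\pi}} + \int_a^{\infty} \frac{d\, e^{-\xi^2/2}}{\xi^3\sqrt{2\pi}}. \tag{B3}
$$

For the second relation one notes that for $a \gg 1$ the relevant domain of integration is $\xi \sim a$. In more detail:

$$
\int_0^{a} d\xi\, e^{\xi^2/2} = e^{a^2/2}\int_0^{a} d\xi\, e^{-a\xi + \xi^2/2} = \frac{e^{a^2/2}}{a}\int_0^{a^2} dy\, e^{-y + y^2/(2a^2)}. \tag{B4}
$$

Now one can expand the second exponent in the RHS of (B4), since the main contribution to the integral comes from $y \sim 0$ (the other side, that is $y \sim a^2$, is strongly suppressed):

$$
\frac{1}{a}\int_0^{a^2} dy\, e^{-y + y^2/(2a^2)} = \frac{1}{a}\int_0^{a^2} dy\, e^{-y}\left(1 + \frac{y^2}{2a^2} + \dots\right). \tag{B5}
$$

Neglecting exponentially small terms, one finally gets (B2).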
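
Both asymptotic relations are easy to check numerically. The sketch below compares the two-term forms of (B1) and (B2) with direct evaluations: the Gaussian tail via the complementary error function, and the growing integral via a composite Simpson rule. The function names and the number of quadrature steps are illustrative choices.

```python
import math

def gaussian_tail(a):
    """Integral from a to infinity of exp(-x^2/2)/sqrt(2*pi)."""
    return 0.5 * math.erfc(a / math.sqrt(2.0))

def gaussian_tail_asymptotic(a):
    """Two-term large-a form of Eq. (B1)."""
    return math.exp(-0.5 * a * a) / (a * math.sqrt(2.0 * math.pi)) * (1.0 - 1.0 / a ** 2)

def growing_integral(a, steps=20000):
    """Integral from 0 to a of exp(x^2/2), by the composite Simpson rule."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of subintervals
    h = a / steps
    total = 1.0 + math.exp(0.5 * a * a)  # endpoint contributions
    for j in range(1, steps):
        x = j * h
        total += (4.0 if j % 2 else 2.0) * math.exp(0.5 * x * x)
    return total * h / 3.0

def growing_integral_asymptotic(a):
    """Two-term large-a form of Eq. (B2)."""
    return math.exp(0.5 * a * a) / a * (1.0 + 1.0 / a ** 2)
```

For moderately large arguments the ratios `gaussian_tail_asymptotic(a) / gaussian_tail(a)` and `growing_integral_asymptotic(a) / growing_integral(a)` approach one, with deviations of the order of the first neglected term, $3/a^4$.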



